DOCUMENT RESUME

ED 051 310                                        TM 000 621

AUTHOR          Gardner, Eric F.
TITLE           Measurement in Education; Interpreting Achievement
                Profiles - Uses and Warnings.
INSTITUTION     National Council on Measurement in Education, East
                Lansing, Mich.
PUB DATE        Jan 70
NOTE            12p.; Special Report, v1 n2 1970
AVAILABLE FROM  National Council on Measurement in Education, Office
                of Evaluation Services, Michigan State University,
                East Lansing, Michigan 48823. $2.00 per year (4
                issues); Single issues $0.25 each in quantities of 25
                or more
EDRS PRICE      EDRS Price MF-$0.65 HC Not Available from EDRS.
DESCRIPTORS     *Achievement, Achievement Tests, Comparative
                Analysis, Correlation, *Norms, Performance, *Profile
                Evaluation, Raw Scores, *Scores, Statistical
                Analysis, Student Records, *Test Interpretation,
                Test Results

ABSTRACT

The interpretation of test profiles, graphic devices
which indicate the overall performance of an individual or group of
individuals, is discussed. Some uses of profiles are presented,
together with a three step profile analysis procedure for
interpreting test results. Because profiles are relatively simple to
construct and appear to be easily interpreted, considerable emphasis
is placed upon possible errors in profile use. (AG)



Volume 1, No. 2 






U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
OFFICE OF EDUCATION
THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR
OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL OFFICE OF
EDUCATION POSITION OR POLICY.






January, 1970 



measurement in education 

A SERIES OF SPECIAL REPORTS OF THE NATIONAL COUNCIL ON MEASUREMENT IN EDUCATION 



Interpreting Achievement Profiles - 

Uses and Warnings 







Eric F. Gardner 



ABOUT THIS REPORT

In the press of dealing with great quantities of
information about large numbers of students there
is a natural inclination to seek ways to simplify and
summarize. The test profile is valuable for this
purpose because it can capture the essence of important
relationships and present them in a manner
which is immediately apparent. In that sense the
profile has heuristic value which enhances understanding
of scaled test scores.

Dr. Gardner emphasizes these virtues of the profile,
particularly as they apply to achievement
tests. He takes special care, however, to warn
against corollary dangers in using profiles. These
stem mainly from oversimplifying and overinterpreting
information which is presented in distilled,
visual form. As this report makes clear, test scores
still have the same measurement characteristics and
limitations regardless of the mode of presentation.
This article enumerates specific cautions as well as
useful advice on the use of profiles.

The author is highly qualified to write on the
subject. In his years as Professor and Chairman of
the Psychology Department at Syracuse University
Dr. Gardner has been recognized as a national
authority on measurement in education. As a
coauthor of the well-known Stanford Achievement
Test, he has had unique experience in fostering
good measurement practice. This report continues
the author's tradition of putting that
experience to good use.






ERIC F. GARDNER 

The old Chinese saying that a picture is worth
one thousand words is especially applicable to test
profiles. Profiles are convenient ways of showing
test scores; they are graphic devices enabling us to
see the over-all performance of an individual or
group of individuals at a glance. They provide an
excellent means for gaining a comprehensive picture
of a person's or class' strengths and weaknesses.
Profiles can be very helpful provided we use
suitable caution in their interpretation. In one
sense, a profile is like a good map which reflects
features existing in reality; however, the appearance
of such features on the profile does not
guarantee their reality. An important point to
remember is that although many of us find it
surprisingly easy to believe that a score must be
accurate if we have seen it on a test profile, its
appearance on the profile does not make the score
any more or less accurate or valid.

In general, profiles are used when we wish to
show two or more scores for the same person or
two or more scores for groups of people. We may
be interested in sets of scores obtained at the same
time or sets of scores obtained at fixed intervals
such as those of a student on a group of tests taken
in successive grades.

Profiles show the tests along one axis of the
graph and the score values along the other axis.
Profile forms may show score values along the
vertical axis or along the horizontal; there is no
particular reason for preferring the one over the
other so far as ease of reading is concerned. If, for
example, we wish to prepare a profile sheet for our
own class we would probably want to list our test
variables down the left hand side of the sheet and
to plot our scores along the horizontal axis of the
profile. We would do this because it is easier to
write the complete test identification along a line
than to write it along the narrow confines of a
column.




Since raw (or obtained) test scores may
vary considerably in meaning, it is obvious that raw
scores cannot be used in plotting a profile. Before a
profile can be plotted, then, it is clearly necessary
to transform the scores to sets of comparable
values. There are two ways of doing this. One is to
scale the raw scores on the profile itself so that
each scale has an equivalent mean and equivalent
units of measurement. The other is to convert the
raw scores into some type of derived scores before
plotting them. The most common method is to use
either standard scores, percentile ranks scaled to
proportional standard score distances, or stanines.
Note that when this procedure has been followed,
the standard scores, stanines, or percentile ranks
must be based on the same or strictly comparable
populations all of whom have been tested at the
same time. A discussion of comparability and other
warnings about the construction of profiles and
their interpretation will be presented later on in this
paper. Let us now consider and illustrate several
different kinds of useful profiles.
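
The conversion of raw scores to derived scores described above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure of any test publisher: the norm-group scores are invented, the function names are hypothetical, and the stanine is approximated from the standard score (a stanine scale has a mean of 5 and a standard deviation of 2) rather than read from a published norm table.

```python
import math
from statistics import mean, pstdev


def standard_score(raw, norm_scores):
    """Return the z-score of `raw` relative to a norm group's raw scores."""
    mu = mean(norm_scores)
    sigma = pstdev(norm_scores)  # population SD of the norm group
    return (raw - mu) / sigma


def stanine_from_z(z):
    """Approximate a stanine (mean 5, SD 2) from a z-score, limited to 1..9."""
    return max(1, min(9, math.floor(2 * z + 5.5)))


# Invented norm groups for two subtests with very different raw-score scales.
spelling_norms = [10, 20, 30, 40, 50]
science_norms = [100, 110, 120, 130, 140]

# Raw scores of 50 and 120 are not directly comparable; their stanines are.
spelling_stanine = stanine_from_z(standard_score(50, spelling_norms))
science_stanine = stanine_from_z(standard_score(120, science_norms))
print(spelling_stanine, science_stanine)
```

Both subtests are referred to their own norm group before plotting, which is the point of the comparability requirement: the derived scores are meaningful side by side only if the norm groups themselves are the same or strictly comparable.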

SOME USES OF PROFILES 

1. To Obtain a Picture of the Relative Perfor- 
mance of a Pupil in Several Different Subjects 
or Areas. 

What should you look for in a profile? Is there a
systematic way that you can analyze test results?
The following three steps of analysis represent a
good approach to the interpretation of test results:
Diagnose, Evaluate, Plan. The analysis of the test
profile in Figure 1 illustrates how these steps can
be applied. 1

Analysis of Susan K.'s Profile 

Figure 1 is a sample copy of the Pupil Stanine 
Profile. The three steps of analysis are applied to 
this profile as an illustration of how test scores of a 
pupil can be meaningfully interpreted. At this 
point I would like to call attention to the 
statement in the Stanford Achievement Manual 
which says, "When comparing two subtest stanines 
for an individual pupil, only differences of 2 or 
more stanine levels should be considered significant
by the teacher." 
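
The manual's rule quoted above can be expressed as a one-line check. The sketch below is an illustration only; the function name and the `min_levels` parameter are hypothetical, and the default of 2 simply restates the quoted guideline.

```python
def significant_difference(stanine_a, stanine_b, min_levels=2):
    """True when two subtest stanines differ by at least `min_levels` levels."""
    for value in (stanine_a, stanine_b):
        if not 1 <= value <= 9:
            raise ValueError("stanines run from 1 to 9")
    return abs(stanine_a - stanine_b) >= min_levels


# A one-level gap is treated as noise; a two-level gap is worth a closer look.
print(significant_difference(8, 7))
print(significant_difference(8, 6))
```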

Diagnose - Examine the profile for the most 
obvious subject strengths and weaknesses shown by 
the pupil's performance on this test battery. 

Susan, with plotted stanines of either 8 or 9, is 
achieving best in the areas of word meaning, 
paragraph meaning, spelling, word study skills, 
language, and the social studies. When compared
with other girls and boys of her grade level, Susan

1 The following illustration has been taken from Stanford
Achievement Test, Teachers' Guide for Interpretation
and Use of Test Results, Harcourt, Brace & World, Inc.,




Stanford Achievement Test
Intermediate Complete Battery

Name: K., Susan B.   Date of Testing: 2-26   Grade Placement: 4.6   Age: 9 yr. 6 mo.

Otis Quick-Scoring Mental Ability Test: IQ 124, Stanine 8

Subtest                    Grade Score    %-ile Rank     Stanine (circled)
Word Meaning               70             94             8
Paragraph Meaning          [illegible]    [illegible]    9
Spelling                   76             96             9
Word Study Skills          [illegible]    90             8
Language                   [illegible]    90             8
Arithmetic Computation     [illegible]    [illegible]    7
Arithmetic Concepts        [illegible]    [illegible]    4
Arithmetic Applications    49             [illegible]    6
Social Studies             [illegible]    [illegible]    [illegible]
Science                    46             46             [illegible]



Figure 1. Stanine Profile for Susan K. 



shows average achievement in arithmetic concepts 
and in science, where she has stanines of either 4, 
5, or 6. She shows evidence of understanding the 
application of arithmetic with a stanine of 6 and 
shows considerable competence in arithmetic
computation with a stanine of 7. With an I.Q. of 124
and a corresponding stanine of 8, Susan would 
normally be expected to achieve stanines of 7 or 
above in the various subjects. 
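
The comparison of achievement with measured ability in the Diagnose step can be sketched as follows. This is an illustration under assumptions, not a rule from the Stanford manual: the function name is hypothetical, the `allowance` of one stanine level is inferred only from the single example above (ability stanine 8, expected achievement 7 or above), and the stanine values are illustrative rather than a transcription of Figure 1.

```python
def below_expectation(stanines, ability_stanine, allowance=1):
    """List subtests whose stanine falls below (ability stanine - allowance)."""
    expected_minimum = ability_stanine - allowance
    return [name for name, value in stanines.items() if value < expected_minimum]


# Illustrative stanines for a pupil with an ability stanine of 8.
pupil_stanines = {
    "Word Meaning": 8,
    "Arithmetic Computation": 7,
    "Arithmetic Applications": 6,
    "Science": 5,
}
flagged = below_expectation(pupil_stanines, ability_stanine=8)
print(flagged)
```

A flagged subtest is a prompt for the questions raised in the Evaluate step, not a finding in itself; the cautions about score accuracy apply to the ability measure as much as to the achievement scores.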

Evaluate - Relate the pupil's scores on the
achievement test to such variables as your estimate
of the pupil, his grades, his performance on a test
of mental ability, and the like.

Susan's test results indicate that she is a superior
student in the language arts and in social studies.
Her school marks and judgments of previous
teachers should reflect this superiority. If the test
was taken in the spring, have school marks through
the school year reflected this superiority? If not,
why not? What are Susan's personal attitudes? Is
she a nonconformist? Does she excel in aspects of
a subject not measured by the tests? Is Susan a
highly verbal memorizer? Is she a poor reasoner in
mathematics and science? What are her interests?
Doesn't Susan need special encouragement and
help in mathematics and science? These and other
questions arise when test scores and other evaluations
do not correspond.






Plan - Plan a program of classroom activities 
that will remedy some of the obvious shortcomings
and will build upon the greater strengths of each 
pupil. 

Diversity of interest in subject areas and of levels 
of achievement in them is inevitable and even 
desirable among pupils and within each individual
pupil. But no pupil in the elementary grades, 
especially one of Susan's general level of ability, 
should fail to learn the fundamental subjects such 
as arithmetic. 

Because of Susan's outstanding work in the
language arts, her teacher can be reasonably assured
that her inability to score much above
average in the area of science or in arithmetic
concepts does not stem from any reading difficulty.
This then leaves the teacher at least two
factors to consider: (1) lack of interest and (2)
lack of fundamental knowledge about the underlying
concepts in science and in arithmetic. These
possible deficiencies could have resulted from
inadequate experience with these subjects,
inadequate interest evidenced in the home in these
content areas, lack of a stimulating teacher of these
subjects, and the like.

The instructional problem here is a relatively
simple one. A careful, thorough discussion with
Susan should draw from her the level of interest in
these areas and also some reasons for a lack of
understanding in the basic concepts. Her shortages
in knowledge of mathematics concepts need to be
diagnosed. As a result of such understanding,
Susan's teacher will be able to build an instructional
program that will improve Susan's performance
in these areas.



2. To Compare the Performance of a Single 
Grade on Several Subjects and with the Na- 
tional Norm. 

One common use of achievement test batteries is
in connection with some phase of administration
or supervision. The supervisor is interested in
knowing strengths and weaknesses in specific
subjects so that they can be given greater attention.
Frequently the national performance is accepted as
a standard. Although national norms are useful as
one frame of reference, it is important to recognize
that achievement at the national average may well
be an unreasonable goal for a particular school,
class, or system. Norms are not designed to be
standards nor should they be so designated unless a
consideration of all relevant variables indicates
they represent an appropriate level of average
achievement for a particular group of students.
Even then, by the very definition of a norm, it is
expected that half the pupils will exceed it and half
[will fall below it].




The authors of most achievement test batteries
provide several scales for comparing local achievement
with national norms. We can usually expect
to be furnished with grade equivalents, percentile
ranks, standard scores, and stanines. In spite of
its deficiencies and declining popularity, the
grade equivalent is still the most commonly used
frame of reference for evaluating local achievement.

In Figure 2 a single-grade profile is shown in
which the deviations of the local school system
medians from the corresponding national normative
values are plotted in months of grade equivalent
above or below the norm at the time of
testing. This profile, which represents the performance
of all fourth grade pupils tested the first of
November (Grade 4.2) from a community of
slightly better-than-average socioeconomic level
and moderate size, indicates that achievement is
above the national norm in all areas.
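
The quantity plotted in such a single-grade profile, the local median's deviation from the norm in months of grade equivalent, can be computed as in the sketch below. It is an illustration only: the scores are invented, the function name is hypothetical, and it relies on the usual grade-equivalent convention that one tenth of a grade is one month (so Grade 4.2 is the second month of fourth grade).

```python
from statistics import median


def months_from_norm(local_scores, norm_grade_equivalent):
    """Deviation of the local median from the norm, in months of grade equivalent."""
    local_median = median(local_scores)
    # One tenth of a grade equivalent is one month; round away float noise.
    return round((local_median - norm_grade_equivalent) * 10)


# Invented grade-equivalent scores for a class tested at Grade 4.2.
scores_by_subject = {
    "Spelling": [4.1, 4.6, 5.0],
    "Arithmetic Computation": [3.8, 4.4, 4.5],
}
deviations = {
    subject: months_from_norm(scores, 4.2)
    for subject, scores in scores_by_subject.items()
}
print(deviations)
```

A positive value means the local median lies above the norm for the time of testing. The arithmetic says nothing about whether grade-equivalent units are comparable from one subject to another.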

We have to note, however, that the average I.Q.
of this system was 110 on the Pintner General
Ability Test, and the average age in this grade was
three months younger than the national normative
group. Hence, it is pertinent to ask, "Is this group
exceeding the national norm as much as would be
expected?" Many factors previously mentioned
and others to be discussed later are at issue, e.g.,


[Figure 2 graphic not reproduced; axis: Grade Equivalent Scale, 4.0 to 8.0]

Figure 2. Profile of Fourth Grade Students (Tested
November 1st) Plotted in Terms of Median Standard
Scores Expressed as Grade Equivalents



(1) the comparability of grade equivalent units across
subjects, (2) the fact that mental tests generally correlate
differently with subject-matter achievement tests
from area to area, and (3) the reliability of test scores.

3. School-by-School Comparisons in Terms
of Achievement Tests.

The superintendent of schools, usually the person
who eventually has to approve the purchase of
test materials and the allocation of time for testing,
is interested in knowing how his schools compare
with each other and the national norm. As one
important datum it is often helpful for him and the
supervisor to have a school-by-school comparison
based on standardized test results.



Figure 3 shows such a distribution of standard
scores by school for one subject (Spelling) in a
small school system. 2 Medians have been computed
for each school and these median standard
scores have been circled and joined in order to
make a profile. Although the standard score scales
used here lack some of the deficiencies of grade
equivalent scales and although we are dealing with
medians rather than individual scores, we still have
the typical problems associated with determining
how large an observed difference must be to be
meaningful. Some of these differences are so small
that they can be considered chance differences.
Others are so substantial that they would undoubtedly
be maintained upon retesting. A similar profile
could be made for each class within a specific
grade.
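
The construction of a school-by-school profile from medians can be sketched as below. This is an illustration only: the scores and school names are invented, the function names are hypothetical, and the `min_points` threshold is a placeholder, since the text gives no numerical rule for how large a difference between medians must be to count as meaningful. In practice that judgment would rest on the reliability information for the test in question.

```python
from itertools import combinations
from statistics import median


def school_medians(scores_by_school):
    """Median standard score for each school."""
    return {school: median(scores) for school, scores in scores_by_school.items()}


def notable_differences(medians, min_points):
    """Pairs of schools whose medians differ by at least `min_points`."""
    return [
        (a, b, abs(medians[a] - medians[b]))
        for a, b in combinations(sorted(medians), 2)
        if abs(medians[a] - medians[b]) >= min_points
    ]


# Invented spelling standard scores for three schools.
spelling_scores = {
    "School A": [48, 50, 55],
    "School B": [49, 51, 53],
    "School C": [58, 60, 66],
}
medians = school_medians(spelling_scores)
pairs = notable_differences(medians, min_points=5)  # placeholder threshold
print(medians)
print(pairs)
```

With these invented figures the one-point gap between Schools A and B is set aside, while School C stands apart from both; whether a gap of that size would persist on retesting is exactly the question the threshold is standing in for.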

It is desirable for a school system to carry out
such a testing program for several years using
alternate forms of the same batteries. By relating
this kind of achievement test information to other
factors such as socio-economic status, aptitude
measures, ethnic composition and differences in
the characteristics of the instructional staff, the
administration will gain an increasingly dependable
idea of such school-by-school variations. Some of
these differences, which may be rooted in the
background of abilities that the children bring to
school, require the focusing of special efforts and
resources in particular schools to achieve satisfactory
remediation.



2 Adapted from the manual of the Metropolitan
Achievement Test, Harcourt, Brace & World, 1962.





Figure 3. School-by-School Comparison of Test Results
for Grade 6.7 in a Single Community, Showing
Distributions of Standard Scores for the Spelling Test
in the Metropolitan Intermediate Battery.



4. Comparing a Pupil's Performance in
Successive Years. 

Not only is the profile a useful device for 
portraying the differential performance of a pupil 
(class or school) on the subtests of an achievement 
test battery, it can be used to show profiles of the 
same pupil for several years. The Teacher's Manual 
for the Iowa Tests of Basic Skills presents a 
standard permanent profile chart on which are 




